Pareto distribution (pareto) — heavy-tailed power laws#
The Pareto (Type I) distribution is a continuous distribution on \((x_m, \infty)\) with a power-law tail. It is the canonical model behind the Pareto principle (“80/20 rule”) and appears whenever a small number of observations dominate totals.
What you’ll learn#
how the PDF/CDF encode a power-law tail and a simple quantile function
which moments exist (and why some are infinite)
mean/variance/skewness/kurtosis, entropy, and what happens to the MGF
a clean NumPy-only sampler (inverse transform) and Monte Carlo validation
maximum likelihood estimation (MLE) and how it relates to exponentials on the log scale
practical usage via `scipy.stats.pareto` (`pdf`, `cdf`, `rvs`, `fit`)
import numpy as np
import plotly.express as px
import plotly.graph_objects as go
import os
import plotly.io as pio
import scipy
from scipy import stats
from scipy.integrate import quad
# Plotly rendering (CKC convention)
pio.templates.default = "plotly_white"
pio.renderers.default = os.environ.get("PLOTLY_RENDERER", "notebook")
# Reproducibility
rng = np.random.default_rng(7)
np.set_printoptions(precision=6, suppress=True)
print("numpy:", np.__version__)
print("scipy:", scipy.__version__)
numpy: 1.26.2
scipy: 1.15.0
1) Title & Classification#
Name: `pareto` (Pareto Type I; SciPy: `scipy.stats.pareto`)
Type: Continuous
Support: \(x \in [x_m, \infty)\)
Parameter space:
shape (tail index) \(\alpha > 0\)
scale / minimum \(x_m > 0\)
We write \(X \sim \mathrm{Pareto}(\alpha, x_m)\).
SciPy parameterization (mapping)
SciPy’s stats.pareto(b, loc=0, scale=1) uses a shape parameter b and then applies an affine transformation.
With loc=0 and scale=x_m, SciPy’s b corresponds to our \(\alpha\).
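A quick numerical check of this mapping (parameter values are arbitrary illustrations):

```python
import numpy as np
from scipy import stats

alpha, xm = 2.5, 1.5
x = np.array([1.5, 2.0, 5.0, 20.0])

# Textbook Pareto(Type I) density: f(x) = alpha * xm^alpha / x^(alpha+1)
pdf_manual = alpha * xm**alpha / x ** (alpha + 1)

# SciPy with b=alpha, loc=0, scale=xm should agree
pdf_scipy = stats.pareto(b=alpha, loc=0, scale=xm).pdf(x)
print(np.allclose(pdf_manual, pdf_scipy))  # True
```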
2) Intuition & Motivation#
2.1 What it models#
The Pareto distribution models positive quantities with a hard lower bound and a heavy right tail.
A key feature is its scale-free tail:
\[P(X>x) = \left(\frac{x_m}{x}\right)^{\alpha},\qquad x\ge x_m.\]
On a log–log plot, this tail is a straight line with slope \(-\alpha\).
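The straight-line claim can be seen numerically rather than visually by fitting a least-squares slope to \(\log P(X>x)\) against \(\log x\) (a sketch with arbitrary \(\alpha=2\), \(x_m=1\)):

```python
import numpy as np

alpha, xm = 2.0, 1.0
x = np.geomspace(2.0, 200.0, 50)
sf = (xm / x) ** alpha  # survival function P(X > x)

# Slope of log P(X>x) vs log x recovers -alpha
slope = np.polyfit(np.log(x), np.log(sf), 1)[0]
print(round(slope, 6))  # -2.0
```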
2.2 Typical real-world use cases#
Wealth / income above a minimum threshold (upper tail behavior)
Insurance claims and large losses (catastrophic tail)
City sizes and firm sizes (upper tails)
File sizes / network traffic (bursty, heavy-tailed phenomena)
Waiting times in systems with rare extreme delays
2.3 Relations to other distributions#
Power law / “scale-free”: Pareto Type I is the canonical continuous power-law model.
Log transform to exponential: if \(X\sim\mathrm{Pareto}(\alpha,x_m)\) then \(\log\left(\frac{X}{x_m}\right) \sim \mathrm{Exp}(\text{rate}=\alpha)\). This is extremely useful for inference.
Lomax (Pareto II): if \(X\sim\mathrm{Pareto}(\alpha,x_m)\) then \(X-x_m\) follows a Lomax distribution.
Generalized Pareto (GPD): Pareto is a special/heavy-tailed case closely related to extreme-value modeling.
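The Lomax relation is easy to verify against SciPy's `stats.lomax(c, scale)` parameterization (the values below are illustrative):

```python
import numpy as np
from scipy import stats

alpha, xm = 2.5, 3.0
y = np.linspace(0.1, 50.0, 200)

# If X ~ Pareto(alpha, xm), then X - xm ~ Lomax(c=alpha, scale=xm):
# P(X <= y + xm) should equal the Lomax CDF at y.
cdf_shifted = stats.pareto(b=alpha, scale=xm).cdf(y + xm)
cdf_lomax = stats.lomax(c=alpha, scale=xm).cdf(y)
print(np.allclose(cdf_shifted, cdf_lomax))  # True
```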
3) Formal Definition#
Let \(\alpha>0\) and \(x_m>0\). The PDF is
\[f(x) = \frac{\alpha\,x_m^{\alpha}}{x^{\alpha+1}},\qquad x\ge x_m.\]
The CDF is
\[F(x) = 1-\left(\frac{x_m}{x}\right)^{\alpha},\qquad x\ge x_m.\]
The survival function (tail probability) is
\[S(x) = P(X>x) = \left(\frac{x_m}{x}\right)^{\alpha},\qquad x\ge x_m.\]
The quantile function (inverse CDF) for \(0<u<1\) is
\[Q(u) = x_m\,(1-u)^{-1/\alpha}.\]
def pareto_pdf(x: np.ndarray, alpha: float, xm: float) -> np.ndarray:
'''Pareto(Type I) PDF. Returns 0 for x < xm.'''
x = np.asarray(x, dtype=float)
alpha = float(alpha)
xm = float(xm)
if alpha <= 0 or xm <= 0:
raise ValueError("Pareto parameters must satisfy alpha>0 and xm>0")
out = np.zeros_like(x, dtype=float)
mask = x >= xm
# Use log space for stability
log_pdf = np.log(alpha) + alpha * np.log(xm) - (alpha + 1) * np.log(x[mask])
out[mask] = np.exp(log_pdf)
return out
def pareto_sf(x: np.ndarray, alpha: float, xm: float) -> np.ndarray:
'''Survival function P(X>x).'''
x = np.asarray(x, dtype=float)
alpha = float(alpha)
xm = float(xm)
if alpha <= 0 or xm <= 0:
raise ValueError("Pareto parameters must satisfy alpha>0 and xm>0")
out = np.ones_like(x, dtype=float)
mask = x >= xm
z = alpha * (np.log(xm) - np.log(x[mask])) # <= 0
out[mask] = np.exp(z)
out[~mask] = 1.0
return out
def pareto_cdf(x: np.ndarray, alpha: float, xm: float) -> np.ndarray:
'''Pareto(Type I) CDF.'''
x = np.asarray(x, dtype=float)
alpha = float(alpha)
xm = float(xm)
if alpha <= 0 or xm <= 0:
raise ValueError("Pareto parameters must satisfy alpha>0 and xm>0")
out = np.zeros_like(x, dtype=float)
mask = x >= xm
z = alpha * (np.log(xm) - np.log(x[mask])) # <= 0
out[mask] = -np.expm1(z) # 1 - exp(z), stable when z ~ 0
return out
def pareto_ppf(u: np.ndarray, alpha: float, xm: float) -> np.ndarray:
'''Quantile function (inverse CDF) for 0<u<1.'''
u = np.asarray(u, dtype=float)
alpha = float(alpha)
xm = float(xm)
if alpha <= 0 or xm <= 0:
raise ValueError("Pareto parameters must satisfy alpha>0 and xm>0")
if np.any((u <= 0) | (u >= 1)):
raise ValueError("u must lie strictly in (0,1)")
# Q(u) = xm * (1-u)^(-1/alpha)
return xm * np.exp(-np.log1p(-u) / alpha)
def pareto_rvs(alpha: float, xm: float, size: int | tuple[int, ...], rng: np.random.Generator) -> np.ndarray:
'''NumPy-only Pareto sampler via inverse transform.'''
u = rng.random(size)
# u in [0,1); use 1-u in (0,1] so u==0 maps safely to xm.
return xm * np.exp(-np.log1p(-u) / float(alpha))
# Quick sanity check: PDF integrates to 1
alpha0, xm0 = 2.5, 1.0
area, err = quad(lambda t: alpha0 * xm0**alpha0 / (t ** (alpha0 + 1)), xm0, np.inf)
area, err
(0.9999999999999999, 1.1102230246251563e-15)
4) Moments & Properties#
4.1 Raw moments (and when they exist)#
For \(k<\alpha\) the \(k\)-th raw moment exists and is
\[\mathbb{E}[X^k] = \frac{\alpha\,x_m^{k}}{\alpha-k}.\]
If \(k\ge \alpha\), the integral diverges and \(\mathbb{E}[X^k]=\infty\).
4.2 Mean, variance, skewness, kurtosis#
For \(X\sim\mathrm{Pareto}(\alpha,x_m)\):
Mean (exists for \(\alpha>1\)):
\[\mathbb{E}[X] = \frac{\alpha\,x_m}{\alpha-1}.\]
Variance (exists for \(\alpha>2\)):
\[\mathrm{Var}(X) = \frac{\alpha\,x_m^2}{(\alpha-1)^2(\alpha-2)}.\]
Skewness (exists for \(\alpha>3\)):
\[\gamma_1 = \frac{2(\alpha+1)}{\alpha-3}\sqrt{\frac{\alpha-2}{\alpha}}.\]
Excess kurtosis (exists for \(\alpha>4\)):
\[\gamma_2 = \frac{6(\alpha^3+\alpha^2-6\alpha-2)}{\alpha(\alpha-3)(\alpha-4)}.\]
Additional useful facts:
Mode: \(x_m\)
Median: \(x_m\,2^{1/\alpha}\)
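The median formula can be checked against SciPy's quantile function (illustrative parameters):

```python
import numpy as np
from scipy import stats

alpha, xm = 2.5, 1.0
dist = stats.pareto(b=alpha, loc=0, scale=xm)

# Median formula: xm * 2^(1/alpha)
median_formula = xm * 2 ** (1 / alpha)
print(np.isclose(dist.ppf(0.5), median_formula))  # True
```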
4.3 MGF / characteristic function#
The MGF \(M(t)=\mathbb{E}[e^{tX}]\) diverges for all \(t>0\) (the tail is polynomial).
For \(t<0\), the Laplace transform exists and can be expressed using the upper incomplete gamma function \(\Gamma(s,z)\):
\[M(t) = \alpha(-t x_m)^{\alpha}\,\Gamma(-\alpha, -t x_m),\qquad t<0.\]
The characteristic function exists for all real \(t\) and admits a similar expression:
\[\varphi(t) = \alpha(-i t x_m)^{\alpha}\,\Gamma(-\alpha, -i t x_m).\]
In practice, these are often evaluated numerically.
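For instance, \(M(t)\) at a fixed \(t<0\) can be evaluated without the incomplete gamma function at all, by cross-checking numerical integration of \(e^{tx}f(x)\) against a Monte Carlo average (parameter values and tolerance are ad hoc):

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

alpha, xm, t = 2.5, 1.0, -0.5  # t < 0, so E[exp(tX)] is finite

# Numerical integration of exp(t*x) * f(x) over [xm, inf)
pdf = lambda x: alpha * xm**alpha / x ** (alpha + 1)
mgf_quad, _ = quad(lambda x: np.exp(t * x) * pdf(x), xm, np.inf)

# Monte Carlo estimate of E[exp(tX)]
rng = np.random.default_rng(0)
samples = stats.pareto(b=alpha, scale=xm).rvs(size=200_000, random_state=rng)
mgf_mc = float(np.mean(np.exp(t * samples)))

print(abs(mgf_quad - mgf_mc) < 0.01)  # True: the two estimates agree
```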
4.4 Entropy#
The differential entropy is
\[h(X) = \ln\!\left(\frac{x_m}{\alpha}\right) + 1 + \frac{1}{\alpha}.\]
def pareto_moments(alpha: float, xm: float) -> dict:
a = float(alpha)
xm = float(xm)
if a <= 0 or xm <= 0:
raise ValueError("Pareto parameters must satisfy alpha>0 and xm>0")
mean = np.inf if a <= 1 else a * xm / (a - 1)
var = np.inf if a <= 2 else a * xm**2 / ((a - 1) ** 2 * (a - 2))
skew = np.nan if a <= 3 else (2 * (a + 1) / (a - 3)) * np.sqrt((a - 2) / a)
excess_kurt = (
np.nan
if a <= 4
else 6 * (a**3 + a**2 - 6 * a - 2) / (a * (a - 3) * (a - 4))
)
entropy = np.log(xm / a) + 1 + 1 / a
median = xm * 2 ** (1 / a)
mode = xm
return {
"mean": mean,
"var": var,
"skew": skew,
"excess_kurtosis": excess_kurt,
"entropy": entropy,
"median": median,
"mode": mode,
}
alpha0, xm0 = 2.5, 1.0
theo = pareto_moments(alpha0, xm0)
dist = stats.pareto(b=alpha0, loc=0, scale=xm0)
m, v, s, k = dist.stats(moments="mvsk")
theo, {
"scipy_mean": float(m),
"scipy_var": float(v),
"scipy_skew": float(s),
"scipy_excess_kurt": float(k),
"scipy_entropy": float(dist.entropy()),
}
({'mean': 1.6666666666666667,
'var': 2.2222222222222223,
'skew': nan,
'excess_kurtosis': nan,
'entropy': 0.483709268125845,
'median': 1.3195079107728942,
'mode': 1.0},
{'scipy_mean': 1.6666666666666667,
'scipy_var': 2.2222222222222223,
'scipy_skew': nan,
'scipy_excess_kurt': nan,
'scipy_entropy': 0.4837092681258448})
# Moment existence in action: running averages can be unstable when moments are infinite.
def running_mean(x: np.ndarray) -> np.ndarray:
return np.cumsum(x) / np.arange(1, len(x) + 1)
xm_demo = 1.0
alphas_demo = [0.8, 1.5, 3.0]
n_demo = 30_000
fig = go.Figure()
for a in alphas_demo:
x = pareto_rvs(alpha=a, xm=xm_demo, size=n_demo, rng=rng)
rm = running_mean(x)
fig.add_trace(
go.Scatter(
x=np.arange(1, n_demo + 1),
y=rm,
mode="lines",
name=f"alpha={a}",
)
)
fig.update_layout(
title="Running sample mean for different tail indices (same xm=1)",
xaxis_title="n",
yaxis_title="running mean",
yaxis_type="log",
width=950,
height=450,
)
fig
5) Parameter Interpretation#
5.1 Shape \(\alpha\) (tail index)#
The survival function is
\[P(X>x) = \left(\frac{x_m}{x}\right)^{\alpha},\qquad x\ge x_m.\]
So \(\alpha\) directly controls the tail heaviness:
smaller \(\alpha\) → heavier tail → extreme values are more common
larger \(\alpha\) → lighter tail → extremes decay faster
Moment existence is controlled by \(\alpha\):
mean exists iff \(\alpha>1\)
variance exists iff \(\alpha>2\)
skewness exists iff \(\alpha>3\)
excess kurtosis exists iff \(\alpha>4\)
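SciPy reflects these existence conditions directly: a nonexistent moment comes back as `inf` (or `nan` for standardized moments). A quick illustration with \(\alpha=1.5\), \(x_m=1\):

```python
import numpy as np
from scipy import stats

# alpha = 1.5: the mean exists (alpha > 1) but the variance does not (alpha <= 2)
m, v = stats.pareto(b=1.5, loc=0, scale=1.0).stats(moments="mv")
print(float(m))            # 3.0 = alpha*xm/(alpha-1)
print(np.isinf(float(v)))  # True
```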
5.2 Scale/minimum \(x_m\)#
The parameter \(x_m\) is a hard lower bound. Increasing \(x_m\) scales the distribution to the right.
If \(X\sim\mathrm{Pareto}(\alpha, x_m)\) then \(X/x_m\sim\mathrm{Pareto}(\alpha, 1)\).
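In terms of CDFs, this scaling property says \(P(X\le x) = P(X/x_m \le x/x_m)\); a quick check with arbitrary values:

```python
import numpy as np
from scipy import stats

alpha, xm = 2.0, 4.0
x = np.linspace(4.5, 100.0, 200)

# CDF of Pareto(alpha, xm) at x equals CDF of Pareto(alpha, 1) at x/xm
cdf_scaled = stats.pareto(b=alpha, scale=xm).cdf(x)
cdf_unit = stats.pareto(b=alpha, scale=1.0).cdf(x / xm)
print(np.allclose(cdf_scaled, cdf_unit))  # True
```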
# Shape changes: PDF curves for different alpha
xm = 1.0
x = np.geomspace(xm, 80 * xm, 600)
alphas = [0.8, 1.2, 2.0, 5.0]
fig = go.Figure()
for a in alphas:
fig.add_trace(go.Scatter(x=x, y=pareto_pdf(x, a, xm), mode="lines", name=f"alpha={a}"))
fig.update_layout(
title="Pareto PDF for different alpha (xm=1)",
xaxis_title="x",
yaxis_title="density",
xaxis_type="log",
yaxis_type="log",
width=950,
height=450,
)
fig
# Tail straight line on log-log scale: survival function
xm = 1.0
x = np.geomspace(xm, 500 * xm, 600)
alphas = [1.2, 2.0, 4.0]
fig = go.Figure()
for a in alphas:
fig.add_trace(go.Scatter(x=x, y=pareto_sf(x, a, xm), mode="lines", name=f"alpha={a}"))
fig.update_layout(
title="Pareto survival function P(X>x): log-log straight lines",
xaxis_title="x",
yaxis_title="P(X>x)",
xaxis_type="log",
yaxis_type="log",
width=950,
height=450,
)
fig
6) Derivations#
6.1 Expectation (general \(k\)-th moment)#
For \(k\in\mathbb{R}\):
\[\mathbb{E}[X^k] = \int_{x_m}^{\infty} x^{k}\,\frac{\alpha\,x_m^{\alpha}}{x^{\alpha+1}}\,dx = \alpha\,x_m^{\alpha}\int_{x_m}^{\infty} x^{k-\alpha-1}\,dx.\]
The integral converges iff \(k-\alpha-1 < -1\), i.e. iff \(k<\alpha\). When it converges:
\[\mathbb{E}[X^k] = \frac{\alpha\,x_m^{k}}{\alpha-k}.\]
Plugging in \(k=1\) gives the mean (requires \(\alpha>1\)).
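The closed form can be cross-checked against numerical integration (here \(k=1\), with arbitrary \(\alpha=3\), \(x_m=2\)):

```python
import numpy as np
from scipy.integrate import quad

alpha, xm = 3.0, 2.0

# E[X] by integrating x * f(x) over [xm, inf)
mean_quad, _ = quad(lambda x: x * alpha * xm**alpha / x ** (alpha + 1), xm, np.inf)
mean_formula = alpha * xm / (alpha - 1)  # = 3.0 here
print(np.isclose(mean_quad, mean_formula))  # True
```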
6.2 Variance#
For \(\alpha>2\) we have
\[\mathrm{Var}(X) = \mathbb{E}[X^2] - (\mathbb{E}[X])^2.\]
Using \(\mathbb{E}[X^2] = \frac{\alpha x_m^2}{\alpha-2}\) and \(\mathbb{E}[X]=\frac{\alpha x_m}{\alpha-1}\) yields
\[\mathrm{Var}(X) = \frac{\alpha\,x_m^2}{(\alpha-1)^2(\alpha-2)}.\]
6.3 Likelihood and MLE#
For i.i.d. data \(x_1,\dots,x_n\) with \(x_i\ge x_m\):
\[L(\alpha, x_m) = \prod_{i=1}^{n} \frac{\alpha\,x_m^{\alpha}}{x_i^{\alpha+1}}.\]
The log-likelihood is
\[\ell(\alpha, x_m) = n\log\alpha + n\alpha\log x_m - (\alpha+1)\sum_{i=1}^{n}\log x_i,\]
with the constraint \(x_m\le \min_i x_i\).
If \(x_m\) is known, differentiate w.r.t. \(\alpha\) and set to zero:
\[\frac{n}{\hat{\alpha}} + n\log x_m - \sum_{i=1}^{n}\log x_i = 0 \quad\Longrightarrow\quad \hat{\alpha} = \frac{n}{\sum_{i=1}^{n}\log(x_i/x_m)}.\]
If \(x_m\) is unknown, the likelihood increases with \(x_m\) (for fixed \(\alpha\)) subject to \(x_m\le \min_i x_i\), so
\[\hat{x}_m = \min_i x_i,\]
and then plug into the formula for \(\hat{\alpha}\).
Log-scale trick
Define \(Y_i = \log(x_i/x_m)\). Under the Pareto model, \(Y_i\sim\mathrm{Exp}(\text{rate}=\alpha)\). Inference for \(\alpha\) can be done using exponential-family tools.
def pareto_mle_alpha(x: np.ndarray, xm: float) -> float:
x = np.asarray(x, dtype=float)
xm = float(xm)
if xm <= 0:
raise ValueError("xm must be > 0")
if np.any(x < xm):
raise ValueError("All observations must satisfy x >= xm")
s = np.sum(np.log(x / xm))
if s <= 0:
raise ValueError("Sum log(x/xm) must be positive")
return len(x) / s
def pareto_mle(x: np.ndarray) -> tuple[float, float]:
x = np.asarray(x, dtype=float)
xm_hat = float(np.min(x))
alpha_hat = pareto_mle_alpha(x, xm_hat)
return alpha_hat, xm_hat
# Demonstrate MLE on synthetic data
alpha_true, xm_true = 2.5, 1.0
n = 4000
x = pareto_rvs(alpha=alpha_true, xm=xm_true, size=n, rng=rng)
alpha_hat, xm_hat = pareto_mle(x)
scipy_b_hat, scipy_loc_hat, scipy_scale_hat = stats.pareto.fit(x, floc=0)
{
"true": (alpha_true, xm_true),
"mle": (alpha_hat, xm_hat),
"scipy_fit_floc0": (float(scipy_b_hat), float(scipy_scale_hat)),
}
{'true': (2.5, 1.0),
'mle': (2.523314680630405, 1.000025688150666),
'scipy_fit_floc0': (2.523314680630405, 1.000025688150666)}
# Likelihood shape in alpha (xm fixed at its MLE)
def pareto_loglik_alpha(alpha: float, x: np.ndarray, xm: float) -> float:
a = float(alpha)
if a <= 0:
return -np.inf
if np.any(x < xm):
return -np.inf
n = len(x)
return n * np.log(a) + n * a * np.log(xm) - (a + 1) * np.sum(np.log(x))
grid = np.linspace(0.2, 8.0, 400)
ll = np.array([pareto_loglik_alpha(a, x, xm_hat) for a in grid])
fig = px.line(x=grid, y=ll, title="Log-likelihood as a function of alpha (xm fixed)")
fig.add_vline(alpha_hat, line_dash="dash", line_color="black")
fig.update_layout(xaxis_title="alpha", yaxis_title="log-likelihood", width=950, height=420)
fig
7) Sampling & Simulation#
7.1 Inverse transform sampling (NumPy-only)#
From the CDF \(F(x)=1-(x_m/x)^{\alpha}\) for \(x\ge x_m\), set \(U = F(X)\):
\[U = 1-\left(\frac{x_m}{X}\right)^{\alpha}.\]
Solve for \(X\):
\[X = x_m\,(1-U)^{-1/\alpha}.\]
Algorithm
Draw \(U\sim\mathrm{Unif}(0,1)\).
Return \(X = x_m(1-U)^{-1/\alpha}\).
7.2 Exponential connection (also NumPy-only)#
Since \(\log(X/x_m)\sim\mathrm{Exp}(\text{rate}=\alpha)\), you can also sample via
\[X = x_m\,e^{E/\alpha},\qquad E\sim\mathrm{Exp}(\text{rate}=1).\]
def pareto_rvs_via_exponential(alpha: float, xm: float, size: int | tuple[int, ...], rng: np.random.Generator) -> np.ndarray:
e = rng.exponential(scale=1.0, size=size)
return float(xm) * np.exp(e / float(alpha))
# Validate that both samplers agree (Monte Carlo)
alpha0, xm0 = 2.0, 1.0
n = 100_000
x1 = pareto_rvs(alpha0, xm0, size=n, rng=rng)
x2 = pareto_rvs_via_exponential(alpha0, xm0, size=n, rng=rng)
# Compare a few quantiles
qs = [0.5, 0.9, 0.99]
q1 = np.quantile(x1, qs)
q2 = np.quantile(x2, qs)
qt = pareto_ppf(np.array(qs), alpha0, xm0)
qs, q1, q2, qt
([0.5, 0.9, 0.99],
array([1.415793, 3.132384, 9.755368]),
array([1.413034, 3.147542, 9.814283]),
array([ 1.414214, 3.162278, 10. ]))
# The log-scale exponential relationship
alpha0, xm0 = 3.0, 1.0
x = pareto_rvs(alpha0, xm0, size=50_000, rng=rng)
y = np.log(x / xm0)
grid = np.linspace(0, np.quantile(y, 0.995), 500)
exp_pdf = alpha0 * np.exp(-alpha0 * grid) # Exp(rate=alpha0)
fig = go.Figure()
fig.add_trace(go.Histogram(x=y, nbinsx=80, histnorm="probability density", name="log(X/xm)", opacity=0.6))
fig.add_trace(go.Scatter(x=grid, y=exp_pdf, mode="lines", name="Exp(rate=alpha)", line=dict(width=3)))
fig.update_layout(
title="If X ~ Pareto(alpha,xm), then log(X/xm) ~ Exponential(rate=alpha)",
xaxis_title="y = log(X/xm)",
yaxis_title="density",
width=950,
height=420,
)
fig
8) Visualization#
We’ll visualize:
PDF (log–log to emphasize tail)
CDF
Monte Carlo samples vs theory
alpha, xm = 2.5, 1.0
x = np.geomspace(xm, 200 * xm, 700)
fig = go.Figure()
fig.add_trace(go.Scatter(x=x, y=pareto_pdf(x, alpha, xm), mode="lines", name="PDF"))
fig.update_layout(
title=f"Pareto PDF (alpha={alpha}, xm={xm})",
xaxis_title="x",
yaxis_title="density",
xaxis_type="log",
yaxis_type="log",
width=950,
height=420,
)
fig
fig = go.Figure()
fig.add_trace(go.Scatter(x=x, y=pareto_cdf(x, alpha, xm), mode="lines", name="CDF"))
fig.update_layout(
title=f"Pareto CDF (alpha={alpha}, xm={xm})",
xaxis_title="x",
yaxis_title="F(x)",
xaxis_type="log",
width=950,
height=420,
)
fig
# Monte Carlo: histogram + theoretical PDF overlay
n = 80_000
xs = pareto_rvs(alpha, xm, size=n, rng=rng)
x_max = np.quantile(xs, 0.995)
x_grid = np.geomspace(xm, x_max, 600)
fig = go.Figure()
fig.add_trace(
go.Histogram(
x=xs,
nbinsx=120,
histnorm="probability density",
name="samples",
opacity=0.6,
)
)
fig.add_trace(go.Scatter(x=x_grid, y=pareto_pdf(x_grid, alpha, xm), mode="lines", name="theory", line=dict(width=3)))
fig.update_layout(
title=f"Samples vs theory (alpha={alpha}, xm={xm})",
xaxis_title="x",
yaxis_title="density",
xaxis_type="log",
yaxis_type="log",
width=950,
height=450,
)
fig
9) SciPy Integration#
SciPy’s Pareto is used as:
dist = scipy.stats.pareto(b=alpha, loc=0, scale=xm)
With loc=0, this matches the textbook Pareto(Type I) on \([x_m,\infty)\).
alpha, xm = 2.5, 1.0
dist = stats.pareto(b=alpha, loc=0, scale=xm)
x = np.geomspace(xm, 50 * xm, 300)
# pdf / cdf
pdf_vals = dist.pdf(x)
cdf_vals = dist.cdf(x)
# sampling
xs = dist.rvs(size=5, random_state=rng)
pdf_vals[:3], cdf_vals[:3], xs
(array([2.5 , 2.388099, 2.281208]),
array([0. , 0.03218 , 0.063325]),
array([1.535648, 1.472583, 1.287302, 1.250062, 1.510556]))
# Fitting
alpha_true, xm_true = 2.0, 1.5
x = stats.pareto(b=alpha_true, loc=0, scale=xm_true).rvs(size=5000, random_state=rng)
# If you allow loc to float, it may absorb part of the minimum; usually fix loc=0.
b_hat, loc_hat, scale_hat = stats.pareto.fit(x) # free loc and scale
b_hat0, loc_hat0, scale_hat0 = stats.pareto.fit(x, floc=0) # force loc=0
# Compare to MLE (xm_hat=min(x))
alpha_mle, xm_mle = pareto_mle(x)
{
"true": (alpha_true, xm_true),
"fit_free_loc": (float(b_hat), float(loc_hat), float(scale_hat)),
"fit_floc0": (float(b_hat0), float(loc_hat0), float(scale_hat0)),
"mle": (alpha_mle, xm_mle),
}
{'true': (2.0, 1.5),
'fit_free_loc': (1.9473724039537028, 0.02377036106946595, 1.476315778565303),
'fit_floc0': (1.9680604915855544, 0.0, 1.5000861396347691),
'mle': (1.9680604915855544, 1.5000861396347691)}
10) Statistical Use Cases#
10.1 Hypothesis testing / confidence intervals (tail index)#
With known \(x_m\), define \(Y_i = \log(x_i/x_m)\). Under the Pareto model, \(Y_i\sim\mathrm{Exp}(\text{rate}=\alpha)\).
Let \(S = \sum_i Y_i\). Then \(S\sim\mathrm{Gamma}(\text{shape}=n,\text{rate}=\alpha)\), and
\[2\alpha S \sim \chi^2_{2n}.\]
This gives an exact confidence interval for \(\alpha\) by inverting chi-square quantiles.
10.2 Bayesian modeling#
Again with \(Y_i=\log(x_i/x_m)\), the likelihood is exponential with rate \(\alpha\).
A conjugate prior is Gamma:
\[\alpha \sim \mathrm{Gamma}(a_0, b_0)\quad(\text{shape }a_0,\ \text{rate }b_0).\]
Posterior is
\[\alpha \mid y \sim \mathrm{Gamma}\!\left(a_0 + n,\; b_0 + \textstyle\sum_i y_i\right).\]
10.3 Generative modeling (Pareto principle)#
For \(\alpha>1\), the Pareto distribution implies a Lorenz curve
\[L(p) = 1-(1-p)^{(\alpha-1)/\alpha}.\]
The share of total mass held by the top fraction \(q\) is
\[1 - L(1-q) = q^{(\alpha-1)/\alpha}.\]
This connects \(\alpha\) directly to “80/20”-style statements.
from scipy.stats import chi2
# Hypothesis test + exact CI for alpha (assuming xm known)
alpha_true, xm = 2.2, 1.0
n = 800
x = stats.pareto(b=alpha_true, loc=0, scale=xm).rvs(size=n, random_state=rng)
y = np.log(x / xm)
S = float(np.sum(y))
alpha_hat = n / S
# 95% exact CI via chi-square inversion
delta = 0.05
q_low = chi2.ppf(delta / 2, df=2 * n)
q_high = chi2.ppf(1 - delta / 2, df=2 * n)
alpha_low = q_low / (2 * S)
alpha_high = q_high / (2 * S)
# Two-sided p-value for H0: alpha = alpha0
alpha0 = 2.0
T = 2 * alpha0 * S
p_two_sided = 2 * min(chi2.cdf(T, df=2 * n), 1 - chi2.cdf(T, df=2 * n))
{
"alpha_hat": alpha_hat,
"ci_95": (alpha_low, alpha_high),
"test_H0_alpha=2.0_p": p_two_sided,
}
{'alpha_hat': 2.1500942564639236,
'ci_95': (2.003664756067939, 2.301614469333093),
'test_H0_alpha=2.0_p': 0.04437503708196632}
# Bayesian update for alpha with a Gamma prior (shape-rate)
a0, b0 = 2.0, 1.0 # weak-ish prior
a_post = a0 + n
b_post = b0 + S
prior = stats.gamma(a=a0, scale=1 / b0)
post = stats.gamma(a=a_post, scale=1 / b_post)
grid = np.linspace(0.2, np.quantile(post.rvs(size=50_000, random_state=rng), 0.995), 600)
fig = go.Figure()
fig.add_trace(go.Scatter(x=grid, y=prior.pdf(grid), mode="lines", name=f"Prior Gamma({a0:.0f},{b0:.0f})"))
fig.add_trace(go.Scatter(x=grid, y=post.pdf(grid), mode="lines", name=f"Posterior Gamma({a_post:.0f},{b_post:.1f})", line=dict(width=3)))
fig.update_layout(
title="Posterior over alpha (Pareto tail index)",
xaxis_title="alpha",
yaxis_title="density",
width=950,
height=420,
)
fig
# Generative modeling: calibrate alpha to match a target top-share (e.g. 80/20)
target_q = 0.2
target_share = 0.8
# Solve target_share = q^((alpha-1)/alpha)
# Let r = (alpha-1)/alpha = 1 - 1/alpha.
r = np.log(target_share) / np.log(target_q)
alpha_8020 = 1 / (1 - r)
alpha_8020
1.160964047443681
# Simulate and compare empirical top-share and Lorenz curve
alpha, xm = float(alpha_8020), 1.0
n = 200_000
w = pareto_rvs(alpha, xm, size=n, rng=rng)
# Empirical top-q share
q = 0.2
cutoff = np.quantile(w, 1 - q)
top_share_emp = w[w >= cutoff].sum() / w.sum()
# Theoretical top share
top_share_theory = q ** ((alpha - 1) / alpha)
top_share_emp, top_share_theory
(0.776595898635988, 0.8)
# Lorenz curve: empirical vs theoretical
w_sorted = np.sort(w)
cum = np.cumsum(w_sorted)
cum = np.insert(cum, 0, 0.0)
cum = cum / cum[-1]
p = np.linspace(0, 1, len(cum))
lorenz_theory = 1 - (1 - p) ** ((alpha - 1) / alpha)
fig = go.Figure()
fig.add_trace(go.Scatter(x=p, y=cum, mode="lines", name="empirical"))
fig.add_trace(go.Scatter(x=p, y=lorenz_theory, mode="lines", name="theoretical", line=dict(dash="dash")))
fig.add_trace(go.Scatter(x=[0, 1], y=[0, 1], mode="lines", name="equality", line=dict(color="black", dash="dot")))
fig.update_layout(
title=f"Lorenz curve for Pareto(alpha={alpha:.3f})",
xaxis_title="population share p",
yaxis_title="wealth share L(p)",
width=950,
height=450,
)
fig